What is claimed is:

1. A method of displaying text information corresponding to a speech
portion of audio signals of a television program as a closed caption on a video display
device, the method comprising the steps of:

decoding the audio signals of the television program;

filtering the audio signals to extract the speech portion;

parsing the speech portion into discrete speech components in accordance with
a speech model and grouping the parsed speech components;

identifying words in a database corresponding to the grouped speech
components; and

converting the identified words into text data for display on the display device
as the closed caption.

2. A method according to claim 1, wherein the step of filtering the audio
signals is performed concurrently with the step of decoding later-occurring audio signals
of the television program and the step of parsing earlier-occurring speech signals of the
television program.
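By way of illustration only, the concurrent operation recited above can be sketched as a three-stage pipeline in which each stage runs in its own thread, so that a later frame is being decoded while an earlier one is being filtered or parsed. The names `stage`, `run_pipeline`, `decode`, `filter_speech` and `parse` are hypothetical placeholders and are not taken from the specification; threads and queues are merely one assumed way of arranging the stages.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def stage(func, inbox, outbox):
    """Apply func to every item taken from inbox and forward the result."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # pass the end marker downstream
            return
        outbox.put(func(item))


def run_pipeline(frames, decode, filter_speech, parse):
    """Run decode -> filter -> parse as three concurrently running stages.

    Unbounded queues are used so the feeding loop below never blocks.
    Each stage has a single worker, so frame order is preserved.
    """
    q_in, q_dec, q_flt, q_out = (queue.Queue() for _ in range(4))
    workers = [
        threading.Thread(target=stage, args=(decode, q_in, q_dec)),
        threading.Thread(target=stage, args=(filter_speech, q_dec, q_flt)),
        threading.Thread(target=stage, args=(parse, q_flt, q_out)),
    ]
    for w in workers:
        w.start()
    for frame in frames:
        q_in.put(frame)
    q_in.put(_DONE)

    results = []
    while True:
        item = q_out.get()
        if item is _DONE:
            break
        results.append(item)
    for w in workers:
        w.join()
    return results
```

Any three single-argument callables may be supplied for the stages; the sketch only fixes the order in which they are applied and the fact that they overlap in time.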

3. A method according to claim 1, wherein the step of parsing the speech
portion into discrete speech components includes the step of employing a speaker
independent model to provide individual words as the parsed speech components.

4. A method according to claim 1 further including the step of formatting
the text data into lines of text data for display in a closed caption area of the display device.
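As a non-limiting sketch, the formatting step can be illustrated by wrapping the recognized words into fixed-width rows and grouping the rows into successive caption screens. The function name `format_caption_lines`, the 32-character row width and the two-row caption area are assumptions chosen for the example and are not recited in the claims.

```python
import textwrap


def format_caption_lines(words, width=32, max_rows=2):
    """Wrap recognized words into rows and group the rows into screens.

    width and max_rows are illustrative assumptions about the caption area.
    Returns a list of screens, each screen being a list of text rows.
    """
    rows = textwrap.wrap(" ".join(words), width=width)
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]
```

Each inner list would be displayed together in the caption area before being replaced by the next one.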

5. A method according to claim 1, wherein the step of parsing the speech
portion into discrete speech components includes the step of employing a speaker dependent
model to provide phonemes as the parsed speech components.

6. A method according to claim 5, wherein the speaker dependent model
employs a hidden Markov model and the method further comprises the steps of:

receiving a training text as a part of the television signal, the training text
corresponding to a part of the speech portion of the audio signals;

updating the hidden Markov model based on the training text and the part of
the speech portion of the audio signals corresponding to the training text; and

applying the updated hidden Markov model to parse the speech portion of the
audio signals to provide the phonemes.
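For illustration only, the updating and applying steps can be sketched with a toy discrete hidden Markov model whose hidden states are phonemes and whose observations are quantized acoustic symbols. The sketch assumes that the training text has already been converted to phoneme labels aligned one-to-one with the acoustic symbols; that alignment, the class name `PhonemeHMM`, the count-based update with add-one smoothing and the use of Viterbi decoding are all assumptions of the example, not the method of the specification.

```python
import math


class PhonemeHMM:
    """Toy discrete HMM: states are phonemes, observations are acoustic symbols."""

    def __init__(self, phonemes, symbols):
        self.phonemes = list(phonemes)
        self.symbols = list(symbols)
        # All counts start at 1 (add-one smoothing) so no probability is zero.
        self.start = {p: 1.0 for p in self.phonemes}
        self.trans = {p: {q: 1.0 for q in self.phonemes} for p in self.phonemes}
        self.emit = {p: {s: 1.0 for s in self.symbols} for p in self.phonemes}

    def update(self, observed, labels):
        """Add counts from one training pair (symbols and aligned phoneme labels)."""
        self.start[labels[0]] += 1
        for prev, cur in zip(labels, labels[1:]):
            self.trans[prev][cur] += 1
        for p, s in zip(labels, observed):
            self.emit[p][s] += 1

    @staticmethod
    def _logp(counts, key):
        """Log probability of key under the normalized count table."""
        return math.log(counts[key] / sum(counts.values()))

    def parse(self, observed):
        """Viterbi decoding: most likely phoneme sequence for the symbols."""
        score = {p: self._logp(self.start, p) + self._logp(self.emit[p], observed[0])
                 for p in self.phonemes}
        back = []  # one back-pointer table per observation after the first
        for s in observed[1:]:
            new_score, pointers = {}, {}
            for q in self.phonemes:
                best = max(self.phonemes,
                           key=lambda p: score[p] + self._logp(self.trans[p], q))
                new_score[q] = (score[best] + self._logp(self.trans[best], q)
                                + self._logp(self.emit[q], s))
                pointers[q] = best
            back.append(pointers)
            score = new_score
        state = max(self.phonemes, key=lambda p: score[p])
        path = [state]
        for pointers in reversed(back):
            state = pointers[state]
            path.append(state)
        return path[::-1]
```

In this sketch, each call to `update` corresponds to one received training text and its matching speech segment, and `parse` is then applied to speech for which no training text is available.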

7. A method of displaying text information corresponding to a speech portion
of audio signals of a television program as a closed caption on a video display device, the
method comprising the steps of:

decoding the audio signals of the television program;

filtering the audio signals to extract the speech portion;

receiving a training text as a part of the television signal, the training text
corresponding to a part of the speech portion of the audio signals;

generating a hidden Markov model from the training text and the part of the
speech portion of the audio signals;

parsing the audio speech signals into phonemes based on the generated hidden
Markov model;

identifying words in a database corresponding to grouped phonemes; and

converting the identified words into text data for presentation on the display of
the audio-visual device as closed captioned textual data.

8. A method according to claim 7, wherein the step of filtering the audio
signals is performed concurrently with the step of decoding later-occurring audio signals
of the television program and the step of parsing earlier-occurring speech signals of the
television program.

9. A method according to claim 7 further including the step of formatting
the text data into lines of text data for display in a closed caption area of the display device.

10. A method according to claim 7, further comprising the step of
providing respective audio speech signals and training texts for each speaker of a plurality of
speakers on the television program.

11. Apparatus for displaying text information corresponding to a speech
portion of audio signals of a television program as a closed caption on a video display
device, the apparatus comprising:

a decoder which separates the audio signals from the television program
signals;

a speech filter which identifies portions of the audio signals that include speech
components and separates the identified speech component signals from the audio signals;

a phoneme generator which parses the speech portion into phonemes in
accordance with a speech model;

a database of words, each word being identified as corresponding to a discrete
set of phonemes;

a word matcher which groups the phonemes provided by the phoneme
generator and identifies words in the database corresponding to the grouped phonemes; and

a formatting processor that converts the identified words into text data for
display on the display device as the closed caption.

12. Apparatus according to claim 11, wherein the speech filter, the decoder
and the phoneme generator are configured to operate in parallel.
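As one hedged illustration of the database of words and the word matcher recited in claim 11, the database can be modeled as a mapping from phoneme tuples to words, and the grouping as a greedy longest-match scan over the phoneme stream. The function name `match_words`, the sample lexicon and the longest-match strategy are assumptions of this sketch; the claims do not prescribe a particular grouping rule.

```python
# Hypothetical word database: each word keyed by its discrete set of phonemes.
LEXICON = {
    ("HH", "AH", "L", "OW"): "hello",
    ("W", "ER", "L", "D"): "world",
    ("HH", "AH"): "huh",
}


def match_words(phonemes, lexicon):
    """Group a phoneme stream into words by greedy longest match.

    Assumes a non-empty lexicon. Phonemes that start no known word
    are skipped one at a time.
    """
    longest = max(len(key) for key in lexicon)
    words, i = [], 0
    while i < len(phonemes):
        # Try the longest candidate group first, then shorter ones.
        for n in range(min(longest, len(phonemes) - i), 0, -1):
            word = lexicon.get(tuple(phonemes[i:i + n]))
            if word is not None:
                words.append(word)
                i += n
                break
        else:
            i += 1  # no entry starts at this phoneme; skip it
    return words
```

The resulting word list is what a formatting processor would then convert into caption text.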

13. Apparatus according to claim 11, wherein the phoneme generator 
includes a speaker independent speech recognition system. 

14. Apparatus according to claim 11, wherein the phoneme generator 
includes a speaker dependent speech recognition system. 

15. Apparatus according to claim 14, wherein the speech model includes a 
hidden Markov model and the phoneme generator further comprises: 

means for receiving a training text as a part of the television signal, the 
training text corresponding to a part of the speech portion of the audio signals; 

means for updating the hidden Markov model based on the training text and 
the part of the speech portion of the audio signals corresponding to the training text; and 

means for applying the updated hidden Markov model to parse the speech 
portion of the audio signals to provide the phonemes. 

16. A computer readable carrier including computer program instructions 
that cause a computer to implement a method for displaying text information corresponding 
to a speech portion of audio signals of a television program as a closed caption on a video
display device, the method comprising the steps of: 

decoding the audio signals of the television program; 

filtering the audio signals to extract the speech portion; 

parsing the speech portion into discrete speech components in accordance with 
a speech model and grouping the parsed speech components;

identifying words in a database corresponding to the grouped speech
components; and

converting the identified words into text data for display on the display device
as the closed caption.

17. A computer readable carrier according to claim 16, wherein the
computer program instructions that cause the computer to perform the step of filtering the
audio signals are configured to control the computer concurrently with the computer program
instructions that cause the computer to perform the step of decoding the audio signals of the
television program and with the computer program instructions that cause the computer to
perform the step of parsing the speech signals of the television program.

18. A computer readable carrier according to claim 16, wherein the
computer program instructions that cause the computer to perform the step of parsing the
speech portion into discrete speech components include computer program instructions that
cause the computer to use a speaker independent model to provide individual words as the
parsed speech components.

19. A computer readable carrier according to claim 16 further including
computer program instructions that cause the computer to format the text data into lines of
text data for display in a closed caption area of the display device.

20. A computer readable carrier according to claim 16, wherein the computer
program instructions that cause the computer to perform the step of parsing the speech portion
into discrete speech components include computer program instructions that cause the
computer to use a speaker dependent model to provide phonemes as the parsed speech
components.



